Claims



We claim: 

1. A method for identifying talking heads in a compressed video, comprising:

extracting motion activity descriptors from each of a plurality of shots;

combining the plurality of motion activity descriptors of each shot into a shot motion activity descriptor;

measuring a distance between the shot motion activity descriptor and a template motion activity descriptor; and

identifying a particular shot as a talking head if the measured distance is less than a predetermined threshold.
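The four steps of claim 1 can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes each descriptor is already available as a fixed-length list of numbers, that descriptors are combined by a component-wise mean, and that the distance is a plain sum of absolute differences. The function names `combine_descriptors`, `l1_distance`, and `is_talking_head` are hypothetical.

```python
from statistics import fmean


def combine_descriptors(descriptors):
    """Combine the descriptors of one shot into a single shot descriptor.

    Assumption: a component-wise mean. Claim 1 does not fix the combining rule.
    """
    return [fmean(component) for component in zip(*descriptors)]


def l1_distance(shot, template):
    """Placeholder distance (assumption): sum of absolute differences."""
    return sum(abs(s - t) for s, t in zip(shot, template))


def is_talking_head(descriptors, template, threshold, distance=l1_distance):
    """Return True if the shot descriptor lies closer to the template
    than the predetermined threshold (strictly less, as in claim 1)."""
    shot_descriptor = combine_descriptors(descriptors)
    return distance(shot_descriptor, template) < threshold
```

Passing the distance as a parameter keeps the thresholding step unchanged when a different distance measure is substituted.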

2. The method of claim 1 further comprising:

extracting a plurality of training motion activity descriptors from a training video including a plurality of training shots, each training shot including a training talking head; and

combining the plurality of training motion activity descriptors into the template motion activity descriptor.

3. The method of claim 2 wherein the combining is a median of the plurality of training motion activity descriptors.

4. The method of claim 2 wherein the combining is a mean of the plurality of training motion activity descriptors.
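The template construction of claims 2 to 4 can be illustrated with a short sketch. It assumes the training descriptors are equal-length lists of numbers and that the median or mean is taken component by component; the function name `build_template` and its `combine` parameter are hypothetical.

```python
from statistics import fmean, median


def build_template(training_descriptors, combine="median"):
    """Combine training descriptors into one template descriptor.

    Assumption: the median (claim 3) or mean (claim 4) is applied
    independently to each descriptor component.
    """
    reducers = {"median": median, "mean": fmean}
    reducer = reducers[combine]
    return [reducer(component) for component in zip(*training_descriptors)]
```

With one outlying training shot, the median template is pulled less toward the outlier than the mean template, which is one plausible reason to offer both alternatives.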

5. The method of claim 1 further comprising:

normalizing the measured distance.

6. The method of claim 1 wherein the threshold is a standard deviation σ of the template motion activity descriptor.

7. The method of claim 1 wherein each motion activity descriptor is of the form $C_{avg}, N_{sr}, N_{mr}, N_{lr}, \sigma_{fr}$, where $C_{avg}$ is an average motion vector, and $N_{sr}$, $N_{mr}$, $N_{lr}$ are short, medium and long run zero-length motion vectors, respectively.

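8. The method of claim 7 wherein the distance is measured according to:

$$D(S,T) = \frac{W_{tot}}{C_{avg}(T)}\left|C_{avg}(T) - C_{avg}(S)\right| + \frac{W_{tot}}{N_{sr}(T)}\left|N_{sr}(T) - N_{sr}(S)\right| + \frac{W_{tot}}{N_{mr}(T)}\left|N_{mr}(T) - N_{mr}(S)\right| + \frac{W_{tot}}{N_{lr}(T)}\left|N_{lr}(T) - N_{lr}(S)\right|$$

where $W_{tot}$ is a normalizing weight, $T$ is the template motion activity descriptor, and $S$ is the shot motion activity descriptor.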
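A weighted distance of this kind can be sketched as below. The sketch assumes each term is the absolute difference between a template component and the matching shot component, scaled by the normalizing weight divided by the template component. The claim only calls the weight a normalizing weight, so taking it as the sum of the template components by default is an assumption, as are the function name `weighted_distance` and the `w_tot` parameter.

```python
def weighted_distance(shot, template, w_tot=None):
    """Weighted absolute-difference distance between two descriptors.

    shot, template: equal-length sequences such as
        (C_avg, N_sr, N_mr, N_lr).
    w_tot: normalizing weight. Assumption: defaults to the sum of the
        template components, which the claim does not specify.
    """
    if any(t == 0 for t in template):
        raise ValueError("template components must be non-zero")
    if w_tot is None:
        w_tot = sum(template)
    # Each term is (w_tot / T_i) * |T_i - S_i|.
    return sum((w_tot / t) * abs(t - s) for s, t in zip(shot, template))
```

Dividing by the template component makes each term a relative deviation, so components with large absolute values, such as run counts, do not dominate the sum.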

9. The method of claim 1 further comprising:

measuring a distance between the shot motion activity descriptor and a set of template motion activity descriptors.

10. The method of claim 1 wherein the distance is a semi-Hausdorff distance.
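For the set-based comparison of claims 9 and 10, a semi-Hausdorff (directed Hausdorff) distance can be sketched as follows. The sketch assumes the usual definition: for each element of the first set take the distance to its nearest element in the second set, then take the largest of those values. Which set plays which role, the point-wise distance `l1_distance`, and the function names are assumptions made for illustration.

```python
def l1_distance(a, b):
    """Point-wise distance (assumption): sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def semi_hausdorff(set_a, set_b, distance=l1_distance):
    """Directed Hausdorff distance from set_a to set_b.

    For every descriptor in set_a, find its nearest descriptor in set_b,
    then return the largest of those nearest-neighbour distances.
    The measure is not symmetric in its two arguments.
    """
    return max(min(distance(a, b) for b in set_b) for a in set_a)
```

Because the measure is directed, swapping the template set and the shot set generally gives a different value, so the direction has to be fixed before a threshold is chosen.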

11. The method of claim 1 wherein the template motion activity is modeled by a discrete function.

12. The method of claim 1 wherein the template motion activity is modeled by a continuous function.

13. The method of claim 12 wherein the continuous function is a mixture of Gaussian distributions.

14. The method of claim 1 further comprising:

extracting a plurality of training motion activity descriptors from sampled frames of a training video including a plurality of training shots, each training shot including a training talking head; and

combining the plurality of training motion activity descriptors into the template motion activity descriptor.

15. The method of claim 1 further comprising:

segmenting the video into the plurality of shots using the motion activity descriptors.

16. The method of claim 1 further comprising:

retaining only talking head shots.